-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sharing CpuAdressableDisplayProvider between eglstreams and gbm platforms #3067
Conversation
This'll get Mir starting, but it won't actually display anything. To get the CPU buffers on the screen, you'll need to additionally touch Here is where that is currently done, and specifically here is where the KMS FB ID is used. |
Oh that's the |
And does that mean that we sometimes use the |
Yes. When there's a non-null As a practical matter I think any given instance is only ever going to use one of these paths - either always direct KMS, or always EGLStream. |
Extra bits of relevance: |
I'm getting a permission denied error on |
6a201e0
to
90649d9
Compare
Codecov Report
@@ Coverage Diff @@
## main #3067 +/- ##
==========================================
- Coverage 78.43% 78.36% -0.07%
==========================================
Files 1073 1074 +1
Lines 74717 74804 +87
==========================================
+ Hits 58602 58623 +21
- Misses 16115 16181 +66
... and 7 files with indirect coverage changes 📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
I've not seen this one elsewhere:
|
Correction. As one could expect, this happens occasionally on platform-API-merge as well. |
{ | ||
auto const predicted_render_time = 50ms; // Predicted worst case render time for the next frame... | ||
|
||
output->set_crtc(*next_swap); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will do an immediate, tearing CRTC update (ie: it will not wait for vsync). I don't think this will trigger the flip event handler, either?
Hm. It's not obvious why that would be the case. I was wondering whether we'd need to not do the EGLStream setup if we're using the CPUAllocated path. That is, defer the But I wouldn't expect that to manifest as permission denied. I'll test this and see what I can immediately dig up tomorrow. |
Sounds good let me know 👍 |
With Wait - I could have sworn that a previous version of this set With that fixed, it then doesn't crash, but it does hang indefinitely in |
OK. The last commit I pushed almost works (and is not suitable for merging). Everything seems to work - post() doesn't block, no exceptions are raised, the frame events get signalled - except that nothing appears on the screen. 🤦♀️ |
https://gitlab.freedesktop.org/daniels/kms-quads/-/blob/master/kms.c is a pretty good reference for Atomic KMS shenanigans. |
Reproduced this on my local setup at least 😄 Although I'm getting a mysterious error:
Correction: I can flash stuff on the screen (e.g. gedit), but it is clearly mildly broken... Update: Obviously the cursor update on gedit is making it pop on the screen from time to time. The screens seems to have no persistence though, as though it's getting cleared right away... 🤔 |
|
||
scheduled_fb = std::move(next_swap); | ||
next_swap = nullptr; | ||
auto const predicted_render_time = 50ms; // Predicted worst case render time for the next frame... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't actually need to hook up the recommended sleep stuff here; it's not really hooked up anywhere else and it's not really useful.
|
||
if (ret != 0) | ||
{ | ||
BOOST_THROW_EXCEPTION((std::system_error{-ret, std::system_category(), "Failed to commit Atomic KMS state"})); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should at least fall back to trying again with ALLOW_MODESET
(that would require sharing some more state with configure
, notably the “mode blob”).
I don't think we need to have a fallback to drmModeSetCrtc, as we've guaranteed we have the atomic API available
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this was the cause of this issue. The kmscube people always do an allow mode set on the first render, and then turn it off afterwards. Does that sound like something we might want too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well... actually it seems like configure should do the ALLOW_MODESET for us huh?
@@ -316,6 +320,42 @@ uint32_t mgek::EGLOutput::crtc_id() const | |||
return static_cast<uint32_t>(crtc_id); | |||
} | |||
|
|||
bool mgek::EGLOutput::queue_atomic_flip(FBHandle const& fb, void const* drm_event_userdata) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be shared with configure
; it's largely the same code. Probably configure
can be changed to call this.
@RAOF I'm finding the following error when running
Update: The culprit is the Update again: After fidgeting with it/running it a lot, I can no longer reproduce, which leads me to believe it may be some sort of modesetting issue? Update again again: I see this after restarting my computer. That's a funny one! I'll see if allowing a modeset later on makes it happy? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, works¹ for me, both with just eglstream-kms
driving the display + gbm-kms
rendering, and with MIR_EXPERIMENTAL_HYBRID
so that both amdgpu and nvidia displays are active.
Just a couple of nits, plus some error handling needed.
¹: There's a malloc-corruption crash/EGL_BAD_ALLOC I see with gbm-kms, but I don't think this is related and I'll continue to investigate.
@@ -256,7 +263,7 @@ class DisplayBuffer | |||
|
|||
std::chrono::milliseconds recommended_sleep() const override | |||
{ | |||
return std::chrono::milliseconds{0}; | |||
return std::chrono::milliseconds(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This diff is not particularly necessary 😉
drmModeAtomicAddProperty(request.get(), plane_id, plane_props->id_for("CRTC_ID"), _crtc_id); | ||
drmModeAtomicAddProperty(request.get(), plane_id, plane_props->id_for("FB_ID"), fb); | ||
|
||
/* We don't monitor the DRM events (yet), so have no userdata */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We actually do have userdata now 😉
int ret = atomic_commit(fb, drm_event_userdata, DRM_MODE_ATOMIC_NONBLOCK | DRM_MODE_PAGE_FLIP_EVENT); | ||
if (ret != 0) | ||
{ | ||
BOOST_THROW_EXCEPTION((std::system_error{-ret, std::system_category(), "Failed to commit Atomic KMS state"})); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, there are a bunch of ways atomic_commit
can fail, but transiently. It can require a modeset after switching back from VT switching. It can also fail with EBUSY
if there's already a pending commit that hasn't finished (although we shouldn't hit that, having waited for previous page flips.)
Maybe we don't have to handle EBUSY
here, and just assume that our page-filp handling will cover it.
We should handle the resume-from-VT-switch, which probably means twiddling a flag in mge::Display::resume()
that causes queue_atomic_flip
to additionally pass ALLOW_MODESET
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented something that makes a lot of sense, but I get a fatal error whenever I VT switch 🤔
display platform
af206e7
to
8361f3d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, so this does look plausible, but we die with EPERM
in atomic-commit when VT switching away.
What's happening here is that we've lost DRM master, but are still in the process of scheduling a flip. We should catch EPERM
in this case or, possibly, not throw an exception at all here and just log any failure - that's what gbm-kms does.
next_swap = std::move(fb); | ||
return true; | ||
} | ||
if (auto fb = std::dynamic_pointer_cast<graphics::FBHandle>(renderable_list[0].buffer)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, what? Is this just the same if
duplicated twice?
Ok, a debugging log:
|
So next steps here:
KMS kernel debugging: The kernel docs show what debug options are available. |
Urgh, there are so many ways that an atomic commit can fail, even if we do everything correctly. We *might* need to punt this way back up to the `Compositor` to handle to get the EGLStream output VT switch wortking, but that's broken on Mir release anyway 😬
We now handle `EBUSY` higher up the stack, so the `EGLOutput` can always use non-blocking commits, even for modesetting.
We fully expect VT switching to cancel flip notifications, we don't really need to log that now.
Alright! Apart from #3089 (which happens on 2.15, too 🤦♀️), this appears to entirely work for me now. 🎉 VT switching with the |
“Basic” is never the best name if there's something else you can think of, and `kms::` is a much better descriptor for it.
We require KMS “dumb” buffers for the CPU-addressable output, so we might as well check that the driver claims to support them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now good to go as far as I'm concerned. The only vaguely concerning thing is that set_flags_for_next_flip
is thread-unsafe, but this doesn't matter because it's only called from resume()
, which is called from DisplayServer::Private::resume()
, which calls it before the compositor thread(s) are started.
I can't really approve this now, having authored a bunch of it, so @AlanGriffiths ?
And I can confirm our smoke tests are fine, too :)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't really test this, but it looks reasonable.
Some nitpicking to address
src/platforms/common/server/kms_cpu_addressable_display_provider.cpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks OK now
Not merging as @mattkae is planning on some (hopefully final) testing
Got a test to fix on clang:
|
OK, I know what is different, just not (yet) worked out why... In the clang build In the gcc build, Now to figure out how this needs fixing... |
@RAOF and @AlanGriffiths: I found these two bugs in my final testing, but feel free to tackle them at a later date if you think they shouldn't block this work: |
What's new?
CPUAddressableDisplayProvider
so that it works in this specific hybrid situationHow to test
It still SIGABRTs after, but it gets through the selection part.